Enable checksum validation for presigned URL downloads#7035
Enable checksum validation for presigned URL downloads#7035jencymaryjoseph wants to merge 2 commits into
Conversation
74d413d to
f3cc9d7
Compare
| } | ||
| for (String param : query.split("&")) { | ||
| if (param.startsWith("X-Amz-SignedHeaders=")) { | ||
| return param.substring("X-Amz-SignedHeaders=".length()).contains("x-amz-checksum-mode"); |
There was a problem hiding this comment.
I can't remember exactly how all of the signing works, but are these guaranteed to be lower case?
There was a problem hiding this comment.
Yes, V4CanonicalRequest.getCanonicalHeaders() lowercases all header names per the SigV4 specification, so X-Amz-SignedHeaders always contains lowercase values.
| if (query == null) { | ||
| return false; | ||
| } | ||
| for (String param : query.split("&")) { |
There was a problem hiding this comment.
I think the query parameters here are URL encoded - I'm trying to think through if there are any cases where we would end up with something invalid because we're not actually decoding these first.
There was a problem hiding this comment.
We use uri.getQuery() which returns the decoded query string, so parsing operates on decoded values.
| private static final HttpChecksum RESPONSE_CHECKSUM_CONFIG = HttpChecksum.builder() | ||
| .requestValidationMode("ENABLED") | ||
| .responseAlgorithmsV2( | ||
| DefaultChecksumAlgorithm.XXHASH3, |
There was a problem hiding this comment.
I'm a little worried about the hard coding of these algorithms here. It would be easy for us to forget to update this with new algorithms.... is there any way we could create it dynamically?
There was a problem hiding this comment.
We can add a new public method in DefaultChecksumAlgorithm to return all checksum algorithms.
Motivation and Context
S3 returns checksum headers (x-amz-checksum-crc32, x-amz-checksum-crc64nvme, etc.) only when the request includes x-amz-checksum-mode: ENABLED as an HTTP header. For presigned URL downloads, this header was not being sent because:
HttpChecksumValidationInterceptor) check request instanceof GetObjectRequest, but presigned URL downloads usePresignedUrlDownloadRequestWrapper— a different type — so the interceptor short-circuits.This meant S3 never returned checksums for presigned URL downloads, making data integrity validation impossible.
Modifications
This change adds checksum header auto-detection in the request marshaller and enables response checksum validation through the existing SDK interceptor pipeline.
PresignedUrlDownloadRequestMarshaller):DefaultAsyncPresignedUrlExtension):HttpChecksumValidationInterceptorto validate response checksums. On mismatch, throwsSdkClientException.AsyncPresignedUrlExtension(interface-level): Documents checksum validation behavior and limitations.S3Presigner.presignGetObject(): Documents that enabling checksum mode makes the URL non-browser-executable.Testing
PresignedUrlDownloadRequestMarshallerTest): Verifies header is added when x-amz-checksum-mode is in SignedHeaders, and not added otherwise.PresignedUrlChecksumValidationWiremockTest): Verifies matching checksum succeeds, mismatching checksum throws SdkClientException, and missing checksum skips validation.DefaultAsyncPresignedUrlExtensionTest): Verifies download completes successfully when checksum headers are present.AsyncPresignedUrlExtensionTestSuite): Verifies against live S3 that checksums are returned when checksum mode is enabled and not returned otherwise.Screenshots (if appropriate)
Types of changes
Checklist
mvn installsucceedsscripts/new-changescript and following the instructions. Commit the new file created by the script in.changes/next-releasewith your changes.License